Characterizing Macro Functions with Cross Section Data


1.   Introduction 

Although econometricians have been estimating time series macro functions
over average data for decades, it is rare to find studies which take into account
the "averaged" character of the data. Even in the simplest consumption function
regression of average consumption on average income, one is only capturing the
statistical relation between two summary statistics of the underlying
consumption-income distribution. Unless saving behavior is virtually identical across
individuals or the structure of the income distribution is extremely limited,
an average consumption-average income regression will not completely describe the
true economic structure relating average consumption to the income distribution.
In the same sense, any macro function relating average data is likely to ignore
many important distributional influences.²

To completely characterize the true population structure underlying a macro
function, one must specify precisely the micro process that connects the individual
component distribution over time. As will be reviewed shortly, work in aggregation
theory has produced several micro model-distribution schemes which justify
certain macro function formulations. But merely stating such a justification is
not enough, as in general any violation of the underlying assumptions will
significantly alter the true form of the macro function. Moreover, all of the approaches
provided by aggregation theory either involve linearity assumptions at the micro
level, or very strict distribution assumptions, such as requiring a normal
distribution for the population of micro variables.

The import of this discussion is that one really needs to utilize micro
data to study any macro function justification which is to be taken seriously.
If cross section data, or micro data from a single time period, is available, then
one can in principle completely characterize the micro distribution as well as the
interactive structure of the micro variables under various sets of modeling assumptions.
But even this process is likely to be very imprecise, leaving large portions of
the observed data configuration unexplained. Also, because cross section data is
only observed for one time period, the distribution is held constant, and so
the impact of time series distributional movements may not be captured by this
process.³

The purpose of this paper is two-fold: first, to discover the conditions under
which cross section data is informative about macro functions and second, given
these conditions, to determine what information simple statistical analyses, such
as least squares micro regressions, can provide. These issues are addressed without
any strong micro relation or population distribution assumptions.

The process of using micro data to approximate macro functions, termed
"statistical aggregation analysis" in the title, rests on the fact that a randomly
sampled cross section data base will accurately mirror the true underlying
population distribution. The approach discussed here is the use of simple micro
statistics to empirically characterize interactive distributional structure which
bears on the macro relation, without recourse to strict modeling assumptions.

Because  of  the  prominence  of  linear  micro  relations  in  aggregation  theory, 
we  begin  our  analysis  by  studying  the  conditions  under  which  least  squares  slope 
coefficients  will  accurately  describe  the  macro  relation.  We  find  that  under 
asymptotic  sufficiency  of  the  average  explanatory  variables  for  determining  the 
average  dependent  variable,  micro  slope  coefficients  will  consistently  estimate 
the  first  derivatives  of  the  true  macro  function. 

Asymptotic sufficiency is a property which holds in virtually all of the explicit
aggregation schemes which appear in the literature.⁴ In particular, one can
guarantee it by assuming a linear functional relation at the micro level,
with either constant coefficients across population members (exact aggregation)
or coefficients which vary independently of the micro explanatory variables
(consistent aggregation). Either of these strategies implies a linear macro
relation in the population averages. Alternatively, one can guarantee asymptotic
sufficiency by requiring the average explanatory variables to be sufficient for
the parameters of the underlying population distribution. Here the macro function
is unrestricted in form, and so for this case we present a methodology for
estimating and testing the values of all higher order derivatives of the macro
function using cross section data.

In section 2 we begin by presenting the notation to be used as well as a
brief review of the aggregation literature. Section 3 contains the major results
on the use of micro regression analysis to study macro functions. Section 4
then presents elements of statistical aggregation analysis under distributional
sufficiency. Finally, section 5 gives a summary and conclusion.

2.  Preliminaries 

2.1  Notation  and  Basic  Framework 

For our discussion we assume that there is a large population of individuals
in T time periods, with periods indexed by t = 1, ..., T. There are $N^t$ individuals
in period t, indexed by n = 1, ..., $N^t$. For each agent n in period t, there is a
vector of personal attributes $A_n^t$. For given t, $A_n^t$ is assumed to capture all
differences in individual agents, whether observable or not. Also, for each
period t and each agent n there is a dependent quantity $x_n^t$, which we assume is
determined by $A_n^t$ via

(2.1)  $x_n^t = f(A_n^t, \gamma^t)$

where $\gamma^t$ is a vector of parameters which given t are constant over all agents,
but may vary over time.

Now, for each t the set $\{A_n^t \mid n = 1, \ldots, N^t\}$ may be considered as a random
sample from a distribution with density $p(A|\theta^t)$.⁵ $\theta^t = (\theta_1^t, \ldots, \theta_L^t)'$ is an
L-vector of parameters which account for all changes in the underlying distribution
$p(A|\theta^t)$ over time t. We denote the parameter space of $\theta$ as $\Gamma$, where
$\Gamma = \{\theta \in R^L \mid p(A|\theta) \text{ is a density}\}$, and where $R^L$ is L-dimensional Euclidean space.

We begin with this general framework in order to allow for virtually all
types of assumptions on the underlying population structure.⁶ Assumptions on
f of (2.1) are referred to as functional form assumptions, whereas assumptions
on p are referred to as distributional assumptions.

For each period t, we observe the following average statistics

(2.2)  $\bar{x}_N^t = \sum_{n=1}^{N^t} x_n^t / N^t$;   $\bar{V}_{mN}^t = \sum_{n=1}^{N^t} v_m(A_n^t) / N^t$,   m = 1, ..., M

where $v_m(A_n^t)$, m = 1, ..., M are observable functions of the underlying attributes
$A_n^t$, with $M \le L$. Denote the vector $(v_1(A_n^t), \ldots, v_M(A_n^t))'$ as $v(A_n^t)$ and the vector
$(\bar{V}_{1N}^t, \ldots, \bar{V}_{MN}^t)'$ as $\bar{V}_N^t$. Our primary interest here is in the relation between
$\bar{x}_N^t$ and $\bar{V}_N^t$, referred to as the macro relation.

As an example, suppose that we are studying consumer demand. Here $x_n^t$ can
represent the demand for a particular commodity by family n in year t, $v_1(A_n^t)$
family income, $v_2(A_n^t)$ family size and $v_3(A_n^t)$ a qualitative variable indicating
whether the family has rural residence. Then $\bar{x}_N^t$ represents average quantity
demanded, $\bar{V}_{1N}^t$ average income, $\bar{V}_{2N}^t$ average family size and $\bar{V}_{3N}^t$ the percentage
of families in the economy with rural residences. Similarly, in a production
context, $x_n^t$ may represent the output of firm n in period t, $v_1(A_n^t)$ its capital
stock and $v_2(A_n^t)$ its labor input. Then $\bar{x}_N^t$ is average firm output, $\bar{V}_{1N}^t$ average
capital stock and $\bar{V}_{2N}^t$ average labor input. In either of these cases elements
common to all agents, such as prices, are capturable through the functional
form parameters $\gamma^t$.⁷
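As a minimal sketch of how the average statistics in (2.2) are formed for the
consumer demand example, the following uses a handful of made-up family records
(the numbers and the name `families` are purely illustrative assumptions, not data
from any survey). Note that averaging the 0/1 rural indicator yields the share of
rural families.

```python
from statistics import fmean

# Made-up micro records for illustration only:
# (quantity demanded, family income, family size, rural residence as 0/1)
families = [
    (12.0, 30.0, 4, 1),
    (8.0, 22.0, 2, 0),
    (15.0, 41.0, 5, 0),
    (9.5, 25.0, 3, 1),
    (11.0, 33.0, 3, 0),
]

# x_bar is the average dependent quantity; v_bar[m] averages the m-th
# observable function v_m(A_n) over the same agents, as in (2.2).
x_bar = fmean(record[0] for record in families)
v_bar = [fmean(record[m] for record in families) for m in (1, 2, 3)]

print("average demand:", x_bar)
print("average income, family size, rural share:", v_bar)
```

The macro relation of interest is then the relation between `x_bar` and the
vector `v_bar` as both move across periods.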

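We make the following assumption concerning the population structure:

ASSUMPTION A.1: All first and second order moments of $x_n^t$ and $v(A_n^t)$ given t exist:

(2.3)  i)   $E(x_n^t \mid \theta^t) = \int f(A^t, \gamma^t)\, p(A^t|\theta^t)\, dA^t = \phi(\theta^t, \gamma^t)$
       ii)  $E(v_m(A_n^t) \mid \theta^t) = \mu_m^t$,   m = 1, ..., M

and

(2.4)  $E[(x_n^t - \phi(\theta^t, \gamma^t))^2 \mid \theta^t] = \sigma_{xx}^t$

       $E[(x_n^t - \phi(\theta^t, \gamma^t))(v_m(A_n^t) - \mu_m^t) \mid \theta^t] = \sigma_{xm}^t$,   m = 1, ..., M

       $E[(v_m(A_n^t) - \mu_m^t)(v_{m'}(A_n^t) - \mu_{m'}^t) \mid \theta^t] = \sigma_{mm'}^t$,   m, m' = 1, ..., M

and the covariance matrix of $v(A_n^t)$:

$\Sigma_{vv}^t = \begin{pmatrix} \sigma_{11}^t & \cdots & \sigma_{1M}^t \\ \vdots & & \vdots \\ \sigma_{M1}^t & \cdots & \sigma_{MM}^t \end{pmatrix}$

is nonsingular.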


As notation, we collect the other moments in matrices as

(2.5)  $\Sigma_{xv}^t = (\sigma_{x1}^t, \ldots, \sigma_{xM}^t)'$

From (2.3) i), we see that the mean of x can be written as a function
of the distributional parameters $\theta^t$ and the functional form parameters $\gamma^t$.
In order to ascertain the large sample relationship between $\bar{x}_N^t$ and $\bar{V}_N^t$, we
reparameterize $E(x|\theta^t) = \phi(\theta^t, \gamma^t)$ in terms of $\mu^t$. Rewriting (2.3) ii) as

(2.6)  $\mu_m^t = \int v_m(A^t)\, p(A^t|\theta^t)\, dA^t = g_m(\theta^t)$


or in vector format as $\mu^t = g(\theta^t)$, gives $\mu^t$ as a function of $\theta^t$. We next adopt

ASSUMPTION A.2: $\mu^t = g(\theta^t)$ is invertible in $\theta_1^t, \ldots, \theta_M^t$, conditional on the
value of $\theta_0^t = (\theta_{M+1}^t, \ldots, \theta_L^t)$ if L > M, or unconditionally if L = M. Moreover,
we assume that the range of g, i.e. $\{g(\theta) \mid \theta \in \Gamma\}$, contains an open convex set $\Phi$
in $R^M$, with the realized values $\mu^t = g(\theta^t)$, t = 1, ..., T, interior points
of $\Phi$.

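Performing this inversion, we can reparameterize $p(A^t|\theta^t)$ as $p^*(A^t|\mu^t, \theta_0^t)$, so
that mean x can be written as

(2.7)  $E(x^t \mid \mu^t, \theta_0^t) = \int f(A^t, \gamma^t)\, p^*(A^t|\mu^t, \theta_0^t)\, dA^t = \phi^*(\mu^t, \theta_0^t, \gamma^t)$

Our final background assumption is

ASSUMPTION A.3: $\nabla_\mu \phi^*(\mu, \theta_0, \gamma)$ exists for all $\mu \in \Phi$, where $\nabla$ denotes the
gradient operator.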


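Conditional on $\theta_0^t$, $\phi^*$ represents the correct large sample relationship
between $\bar{x}_N^t$ and $\bar{V}_N^t$, because by the Weak Law of Large Numbers,

(2.8)  plim $\bar{x}_N^t = \phi^*(\mu^t, \theta_0^t, \gamma^t)$;   plim $\bar{V}_N^t = \mu^t$

so that if $N^t$ is large, we have

(2.9)  $\bar{x}_N^t \cong \phi^*(\bar{V}_N^t, \theta_0^t, \gamma^t)$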
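The large sample statement (2.8) can be illustrated numerically. The sketch
below assumes, purely for illustration, a scalar normal attribute and a quadratic
micro function $x_n = g_0 + g_1 A_n + g_2 A_n^2$, for which the population mean
is $g_0 + g_1\mu + g_2(\mu^2 + \sigma^2)$; the function names and parameter values
are hypothetical choices, not part of the framework itself.

```python
import random


def population_average(n_agents, mu, sigma, gamma, rng):
    """Return (x_bar, v_bar) for x_n = g0 + g1*A_n + g2*A_n**2, A_n ~ Normal(mu, sigma)."""
    g0, g1, g2 = gamma
    total_x = 0.0
    total_v = 0.0
    for _ in range(n_agents):
        a = rng.gauss(mu, sigma)
        total_x += g0 + g1 * a + g2 * a * a
        total_v += a
    return total_x / n_agents, total_v / n_agents


def phi_star(mu, sigma, gamma):
    """Population mean of x as a function of the mean mu, holding sigma fixed."""
    g0, g1, g2 = gamma
    return g0 + g1 * mu + g2 * (mu * mu + sigma * sigma)


rng = random.Random(0)
gamma = (1.0, 0.5, 0.2)   # assumed functional form parameters
mu, sigma = 2.0, 1.0      # assumed distributional parameters
target = phi_star(mu, sigma, gamma)

errors = {}
for n_agents in (100, 10_000, 200_000):
    x_bar, v_bar = population_average(n_agents, mu, sigma, gamma, rng)
    errors[n_agents] = abs(x_bar - target)
    print(n_agents, round(x_bar, 4), round(v_bar, 4), round(target, 4))
```

As the population size grows, the simulated average should settle near the value
of the macro function, which is the sense in which (2.9) is used in what follows.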

In addition to the macro data (2.2), we also observe a random sample of
K agents in a particular period $t_0$; a cross section data base. We index
members of this sample by k = 1, ..., K and therefore have as data $x_k^{t_0}$, $v(A_k^{t_0})$,
k = 1, ..., K. We assume that K is smaller than $N^{t_0}$ but still large enough to
employ large sample statistical results. In this paper our major interest is in
what can be learned from these micro data about $\phi^*$, the macro function. In particular,


in the next section we study the relation of the derivatives of $\phi^*$ to the
slope coefficients $\hat{b}_K$ which result from regressing $x_k^{t_0}$ on $v_1(A_k^{t_0}), \ldots, v_M(A_k^{t_0})$,
k = 1, ..., K by least squares. $\hat{b}_K$ is given as the solution of the normal
equations:

(2.10)  $S_{vv}^{t_0} \hat{b}_K = S_{xv}^{t_0}$

where $S_{vv}^{t_0}$ is the M×M matrix with i,j element

$\sum_{k=1}^{K} (v_i(A_k^{t_0}) - \bar{V}_{iK}^{t_0})(v_j(A_k^{t_0}) - \bar{V}_{jK}^{t_0}) / K$,

$S_{xv}^{t_0}$ is the M×1 matrix with i-th element

$\sum_{k=1}^{K} (x_k^{t_0} - \bar{x}_K^{t_0})(v_i(A_k^{t_0}) - \bar{V}_{iK}^{t_0}) / K$

and $\bar{x}_K^{t_0} = \sum_{k=1}^{K} x_k^{t_0} / K$, $\bar{V}_{iK}^{t_0} = \sum_{k=1}^{K} v_i(A_k^{t_0}) / K$. By standard methods, as K grows

(2.11)  $\Sigma_{vv}^{t_0}\, \text{plim}_{K \to \infty}\, \hat{b}_K = \Sigma_{xv}^{t_0}$   or   $\text{plim}_{K \to \infty}\, \hat{b}_K = (\Sigma_{vv}^{t_0})^{-1} \Sigma_{xv}^{t_0}$

This  concludes  the  presentation  of  the  basic  framework  and  notation. 
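To make the normal equations (2.10) concrete, the following sketch forms the
sample moment matrices from a simulated cross section and solves for the slope
vector. The data generating process, the helper names `solve_linear` and
`micro_slopes`, and all numerical values are illustrative assumptions; any
standard least squares routine would serve equally well.

```python
import random


def solve_linear(a, b):
    """Solve a @ z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - tail) / m[i][i]
    return z


def micro_slopes(x, v):
    """Return the slope vector solving S_vv b = S_xv for data x[k], v[k][0..M-1]."""
    k_obs, m_vars = len(x), len(v[0])
    x_bar = sum(x) / k_obs
    v_bar = [sum(row[i] for row in v) / k_obs for i in range(m_vars)]
    s_vv = [[sum((row[i] - v_bar[i]) * (row[j] - v_bar[j]) for row in v) / k_obs
             for j in range(m_vars)] for i in range(m_vars)]
    s_xv = [sum((x[k] - x_bar) * (v[k][i] - v_bar[i]) for k in range(k_obs)) / k_obs
            for i in range(m_vars)]
    return solve_linear(s_vv, s_xv)


# Simulated cross section with M = 2 explanatory variables (assumed values).
rng = random.Random(1)
v = [[rng.gauss(10.0, 2.0), rng.gauss(3.0, 1.0)] for _ in range(5000)]
x = [4.0 + 0.8 * row[0] - 1.5 * row[1] + rng.gauss(0.0, 1.0) for row in v]
b_hat = micro_slopes(x, v)
print([round(b, 3) for b in b_hat])
```

Because deviations from sample means are used, the constant term drops out and
only the M slope coefficients are solved for, as in (2.10).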


2.2  Previous  Work 

We are now in a position to discuss previous approaches to the aggregation
problem. As pioneered by Houthakker (1957), a variety of studies have appeared
which specify both the distribution p and the functional form f exactly. The
macro function is then found by direct integration. A survey of this work can
be found in Fisher (1969).¹²

Of more recent interest are linear aggregation schemes, which involve
relatively weak distributional assumptions. Exact aggregation models first arose
out of the work of Gorman (1953), followed later by Muellbauer (1975, 1977).
The most important and general result motivating these models is given by
Lau (1977a, b), which is stated briefly as: Suppose that for all underlying
configurations of $A_1^t, \ldots, A_{N^t}^t$, $\bar{x}_N^t$ can be written as

(2.12)  $\bar{x}_N^t = F(\gamma^t, g_1(A_1^t, \ldots, A_{N^t}^t), \ldots, g_M(A_1^t, \ldots, A_{N^t}^t))$

where $g_m$, m = 1, ..., M is a symmetric function of $A_1^t, \ldots, A_{N^t}^t$; then under some
general conditions we must have

(2.13)  i)    $g_m(A_1^t, \ldots, A_{N^t}^t) = \sum_{n=1}^{N^t} v_m(A_n^t) / N^t = \bar{V}_{mN}^t$,   m = 1, ..., M

        ii)   $x_n^t = f(A_n^t, \gamma^t) = C(\gamma^t) + \sum_{m=1}^{M} h_m(\gamma^t)\, v_m(A_n^t)$

and so  iii)  $\bar{x}_N^t = C(\gamma^t) + \sum_{m=1}^{M} h_m(\gamma^t)\, \bar{V}_{mN}^t$

With no distributional restrictions, the form (2.12) requires $x_n^t$ to be a linear
function in $v(A_n^t)$, with constant coefficients across the population, giving also
a linear macro relation. An additive residual can clearly be incorporated
in $x_n^t$ without disturbing the aggregation structure, and if uncorrelated with
$v(A_n^t)$, the micro slope coefficients defined in (2.10) will consistently estimate
$h_m(\gamma^{t_0})$, m = 1, ..., M.
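The distribution-free character of exact aggregation can be checked directly:
when the micro relation is linear with common coefficients, the average of x
equals the same linear function of the averages of the $v_m$, whatever the shape
of the attribute distribution. The sketch below uses two deliberately different
made-up populations; the coefficient values and population designs are
assumptions chosen only for illustration.

```python
import random


def exact_aggregation_gap(attrs, c, h):
    """Return |x_bar - (c + sum_m h_m * V_bar_m)| when x_n = c + sum_m h_m * v_m(A_n)."""
    n = len(attrs)
    x_bar = sum(c + sum(hm * vm for hm, vm in zip(h, row)) for row in attrs) / n
    v_bar = [sum(row[m] for row in attrs) / n for m in range(len(h))]
    return abs(x_bar - (c + sum(hm * vb for hm, vb in zip(h, v_bar))))


rng = random.Random(2)
c, h = 2.0, (0.6, -0.3)   # assumed common intercept and slopes

# Two very different attribute distributions (both hypothetical).
skewed = [(rng.expovariate(0.1), rng.randint(1, 8)) for _ in range(1000)]
bimodal = [(rng.choice((5.0, 50.0)) + rng.random(), rng.choice((1, 6)))
           for _ in range(1000)]

gaps = [exact_aggregation_gap(pop, c, h) for pop in (skewed, bimodal)]
print(gaps)
```

Up to floating point rounding both gaps vanish, which is relation (2.13) iii)
holding as an identity rather than as a large sample approximation.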

Consistent aggregation approaches assume a linear micro relationship
together with coefficients that vary randomly across the population. If
the coefficient variation is uncorrelated with $v(A_n^t)$, then the macro function
is linear with coefficients equal to the means of micro coefficient
distributions. If in addition, the coefficient variation is independent of $v(A_n^t)$,
then the micro slope estimator (2.10) will consistently estimate the macro
coefficients for period $t_0$. The restrictions on the coefficient variation
amount to partial distributional assumptions which allow the constant micro
coefficient feature of exact aggregation models to be relaxed.
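A small simulation can illustrate why the independence restriction matters. In
the sketch below, a single explanatory variable is used and the micro slope
varies across agents, first independently of v and then systematically with v;
all names and parameter values are assumptions made for the illustration.

```python
import random


def ols_slope(x, v):
    """Least squares slope from regressing x on v and a constant."""
    k = len(x)
    x_bar, v_bar = sum(x) / k, sum(v) / k
    s_xv = sum((xi - x_bar) * (vi - v_bar) for xi, vi in zip(x, v))
    s_vv = sum((vi - v_bar) ** 2 for vi in v)
    return s_xv / s_vv


rng = random.Random(3)
mean_coef = 0.7
v = [rng.gauss(20.0, 5.0) for _ in range(20000)]

# Case 1: coefficients vary independently of v.
indep_coef = [mean_coef + rng.gauss(0.0, 0.2) for _ in v]
x_indep = [1.0 + b * vi for b, vi in zip(indep_coef, v)]

# Case 2: coefficients rise with v (same mean, but dependent on v).
dep_coef = [mean_coef + 0.02 * (vi - 20.0) for vi in v]
x_dep = [1.0 + b * vi for b, vi in zip(dep_coef, v)]

slope_indep = ols_slope(x_indep, v)
slope_dep = ols_slope(x_dep, v)
print(round(slope_indep, 3), round(slope_dep, 3))
```

Under independence the micro slope settles near the mean coefficient 0.7. In the
dependent case the micro relation is effectively quadratic in v, and the slope
settles near 0.3 + 2(0.02)(20) = 1.1 instead, so it no longer recovers the mean
of the coefficient distribution.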

2.3  A Word on Methodology

In consistent and exact aggregation approaches, a linear macro formulation
is obtained with either partial or no distributional assumptions. Here, by
viewing the agent population in a given year t as a drawing from the
distribution with density $p(A^t|\theta^t)$, we can add more structure to the configuration
$\{A_1^t, \ldots, A_{N^t}^t\}$ that exists in the population. Our posture is then to characterize
the theoretical macro relation $\phi^*$ implicit from p and f, by using an observed
random sample from a cross section survey. This approximating method thus
affords a more realistic basis upon which to study distributional influences
on macro relations.

The emphasis on large sample techniques is required to guarantee that the
observed macro data behaves in accordance with the theoretical macro function,
as in (2.9). A sensible approach to aggregation in small samples must either
be based on modeling expected values or must contend with all possible
configurations of underlying population attributes. Modeling expected values
in a small sample context is virtually equivalent to large sample modeling
as is done here. As shown by the work of Lau, for an average equation to be
invariant with regard to all possible configurations of attribute vectors,
the model must be of the exact aggregation type. Thus it appears that discussion
of small sample approaches will yield few new results. Moreover, populations
of states, countries, etc., are generally quite large, so that large sample
approaches are empirically justified. Various authors have argued the need
for emphasis on large sample techniques, notably Theil (1954, 1975) and Green
(1964).

Another issue in aggregation theory is that of recoverability of precise micro
functions from macro function parameters. In an exact aggregation model, values
of the macro coefficients $h_m(\gamma^t)$, m = 1, ..., M will dictate the values of the
micro coefficients, and thus the micro functions are recoverable from the
macro function parameters. In general this property is not available in
large sample approaches. Even in consistent aggregation, estimates of the
macro coefficients give only the means of the micro coefficient distributions.
With arbitrary underlying distribution structures, recoverability requires
an exact aggregation format.

3.   Micro  Regressions  and  Macro  Functions 

3.1  The  Main  Result 

In this section we show the main result that under certain conditions
micro slope regression coefficients as defined in (2.10) will consistently
estimate the first derivatives of the macro function $\phi^*$ with respect to
the mean vector $\mu_v$. For the majority of this section we will consider only
the time period $t_0$, that in which the cross section data is observed, and so
we omit the time superscripts.

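Of central importance to this inquiry is the conditional expectation
of $\bar{x}_N$ given $\bar{V}_N$, denoted $\tilde{x}_N$:

(3.1)  $\tilde{x}_N = E(\bar{x}_N \mid \bar{V}_N, \mu_v, \theta_0)$

In general, $\tilde{x}_N$ is a function of five arguments: $\bar{V}_N$, $\mu_v$, $\theta_0$, $\gamma$, N, so that
we write

(3.2)  $\tilde{x}_N = \tilde{x}(\bar{V}_N, \mu_v, \theta_0, \gamma, N)$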

We require that $\tilde{x}$ obey some regularity properties, as summarized in

ASSUMPTION A.4: $\tilde{x}$ exists and is continuous and differentiable in $\bar{V}_N$ for all
N, and $\nabla_{\bar{V}_N} \tilde{x}$ approaches a finite limit $G(\mu_v, \theta_0, \gamma) \neq 0$ as N approaches
infinity and $\bar{V}_N$ approaches $\mu_v$.

We can obtain the following result concerning the large sample behavior:

LEMMA 3.1:

a)  Under Assumptions A.1 and A.2, we have that as N increases

    plim $\bar{x}_N = \phi^*(\mu_v, \theta_0, \gamma)$;   plim $\bar{V}_N = \mu_v$

and that the asymptotic distribution of

    $\sqrt{N} \begin{pmatrix} \bar{x}_N - \phi^*(\mu_v, \theta_0, \gamma) \\ \bar{V}_N - \mu_v \end{pmatrix}$

is multivariate normal with mean zero and variance covariance matrix

    $\begin{pmatrix} \sigma_{xx} & \Sigma_{xv}' \\ \Sigma_{xv} & \Sigma_{vv} \end{pmatrix}$

b)  Under Assumptions A.1, A.2 and A.4, as $N \to \infty$

    $\sqrt{N}\, [\tilde{x}(\bar{V}_N, \mu_v, \theta_0, \gamma, N) - \phi^*(\mu_v, \theta_0, \gamma)]$

converges in distribution to

    $\sqrt{N}\, (\bar{V}_N - \mu_v)'\, G(\mu_v, \theta_0, \gamma)$

Proof: Part a) is a standard application of the Weak Law of Large
Numbers and the Central Limit Theorem. Part b) is shown in the
Appendix. QED.


We are now in a position to prove the first important result:

THEOREM 3.2: Consider the micro slope regression coefficients
$\hat{b}_K$ as defined in (2.10), obtained by regressing $x_k$ on $v(A_k)$ (and a
constant) in a micro random sample. Under Assumptions A.1, A.2 and
A.4, we have $\text{plim}_{K \to \infty}\, \hat{b}_K = G(\mu_v, \theta_0, \gamma)$.

Proof: Multiply $\sqrt{N}\,(\bar{x}_N - \phi^*(\mu_v, \theta_0, \gamma))$ by $\sqrt{N}\,(\bar{V}_N - \mu_v)$
and take the expectation, giving

    $\Sigma_{xv} = E(N (\bar{x}_N - \phi^*(\mu_v, \theta_0, \gamma))(\bar{V}_N - \mu_v))$

which expands as

    $\Sigma_{xv} = E(N (\tilde{x}_N - \phi^*(\mu_v, \theta_0, \gamma))(\bar{V}_N - \mu_v)) + E(N (\bar{x}_N - \tilde{x}_N)(\bar{V}_N - \mu_v))$
    $\phantom{\Sigma_{xv}} = E(N (\tilde{x}_N - \phi^*(\mu_v, \theta_0, \gamma))(\bar{V}_N - \mu_v))$

where the second term vanishes by first conditioning on $\bar{V}_N$ and
then taking the overall expectation.
We also clearly have that

    $E(N (\bar{V}_N - \mu_v)(\bar{V}_N - \mu_v)') = \Sigma_{vv}$

Applying Lemma 3.1 b), we obtain the equality

    $\lim_{N \to \infty} E(N (\tilde{x}_N - \phi^*(\mu_v, \theta_0, \gamma))(\bar{V}_N - \mu_v)) = \lim_{N \to \infty} E(N (\bar{V}_N - \mu_v)(\bar{V}_N - \mu_v)')\, G(\mu_v, \theta_0, \gamma)$

or, in view of the above developments

    $\Sigma_{xv} = \Sigma_{vv}\, G(\mu_v, \theta_0, \gamma)$

Now, as presented in (2.11), the normal equations defining $\hat{b}_K$ yield that

    $\Sigma_{xv} = \Sigma_{vv}\, \text{plim}\, \hat{b}_K$

which implies

    $\text{plim}_{K \to \infty}\, \hat{b}_K = G(\mu_v, \theta_0, \gamma)$

QED.


From applying results of the Central Limit Theorem, we have just shown
that least squares slope coefficients from a randomly sampled cross section
will consistently estimate the large sample partial derivatives of $\tilde{x}_N$ with
respect to $\bar{V}_N$. In order to relate this result to the derivatives of the
macro function $\phi^*$, we must be very specific about the role of $\bar{V}_N$ in $\tilde{x}_N$. In
order to clarify the roles of $\bar{V}_N$ and $\mu_v$ in $\tilde{x}$, we introduce an M-vector of
dummy arguments $\psi$ and rewrite $\tilde{x}_N$ as

(3.3)  $\tilde{x}_N = \tilde{x}(\psi, \mu_v, \theta_0, \gamma, N) \big|_{\psi = \bar{V}_N}$

Similarly the gradient of $\tilde{x}$ with respect to $\bar{V}_N$ is rewritten as

(3.4)  $\nabla_{\bar{V}_N} \tilde{x}_N = \nabla_\psi \tilde{x}(\psi, \mu_v, \theta_0, \gamma, N) \big|_{\psi = \bar{V}_N}$

As above, we have pointwise convergence of these two functions as in

(3.5)  $\lim_{N \to \infty} \tilde{x}(\mu_v, \mu_v, \theta_0, \gamma, N) = \phi^*(\mu_v, \theta_0, \gamma)$

       $\lim_{N \to \infty} \nabla_\psi \tilde{x}(\mu_v, \mu_v, \theta_0, \gamma, N) = G(\mu_v, \theta_0, \gamma)$

Now, in order to remove some pathological cases from our analysis, we
adopt the following assumption on the gradient vectors $\nabla_\psi \tilde{x}$ and $\nabla_{\mu_v} \tilde{x}$:

ASSUMPTION A.5: $\nabla_\psi \tilde{x}$ converges uniformly to a vector function $G^{**}(\psi, \mu_v, \theta_0, \gamma)$
as $N \to \infty$. Also $\nabla_{\mu_v} \tilde{x}$ exists and converges uniformly to a vector function
$H(\psi, \mu_v, \theta_0, \gamma)$ as $N \to \infty$.

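A.5 implies that $\tilde{x}$ converges to a function $\phi^{**}(\psi, \mu_v, \theta_0, \gamma)$ as $N \to \infty$.
From (3.5) we have

(3.6)  $G^{**}(\mu_v, \mu_v, \theta_0, \gamma) = G(\mu_v, \theta_0, \gamma)$

       $\phi^{**}(\mu_v, \mu_v, \theta_0, \gamma) = \phi^*(\mu_v, \theta_0, \gamma)$

and by the uniform convergence assumption¹⁹

(3.7)  $\nabla_\psi \phi^{**}(\psi, \mu_v, \theta_0, \gamma) = G^{**}(\psi, \mu_v, \theta_0, \gamma)$

  so   $\nabla_\psi \phi^{**}(\mu_v, \mu_v, \theta_0, \gamma) = G(\mu_v, \theta_0, \gamma)$

and

(3.8)  $\nabla_{\mu_v} \phi^{**}(\psi, \mu_v, \theta_0, \gamma) = H(\psi, \mu_v, \theta_0, \gamma)$

We can now decompose the gradient of the macro function $\phi^*$ with respect
to $\mu_v$ as

(3.9)  $\nabla_{\mu_v} \phi^*(\mu_v, \theta_0, \gamma) = \nabla_\psi \phi^{**}(\mu_v, \mu_v, \theta_0, \gamma) + \nabla_{\mu_v} \phi^{**}(\mu_v, \mu_v, \theta_0, \gamma)$

       $\phantom{\nabla_{\mu_v} \phi^*(\mu_v, \theta_0, \gamma)} = G(\mu_v, \theta_0, \gamma) + H(\mu_v, \mu_v, \theta_0, \gamma)$

In view of this discussion, we have shown

THEOREM 3.3: Under Assumptions A.1, A.2, A.3, A.4 and A.5

    plim $\hat{b}_K = \nabla_{\mu_v} \phi^*(\mu_v, \theta_0, \gamma)$

if and only if

    $H(\mu_v, \mu_v, \theta_0, \gamma) = 0$.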

Thus, at a given point in time, the micro regression coefficients
will consistently estimate the first derivatives of the macro function
$\phi^*$ if and only if $\nabla_{\mu_v} \phi^{**}$ vanishes at that point. For such regression
coefficients to always consistently estimate these first derivatives, we
must require that $\nabla_{\mu_v} \phi^{**}$ vanish at all possible parameter points. In this
case we can omit reference to $\mu_v$ in $\phi^{**}$, giving

(3.10)  $\phi^{**}(\psi, \mu_v^t, \theta_0^t, \gamma^t) = \phi^{**}(\psi, \theta_0^t, \gamma^t)$

where we have returned to our convention of denoting the time period by a
t superscript.
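The decomposition of the macro derivative into a part operating through the
observed average and a part operating through the population mean can be checked
numerically. The sketch below assumes a hypothetical limiting function of the
form $g_0 + g_1\psi + g_2\psi^2 + g_2\omega^2\mu^2$, in which the mean $\mu$
enters separately from the dummy argument $\psi$; the functional form, the names
and the parameter values are all illustrative assumptions.

```python
def phi_double_star(psi, mu, omega, gamma):
    """Assumed limit of the conditional expectation, with psi and mu kept separate."""
    g0, g1, g2 = gamma
    return g0 + g1 * psi + g2 * psi ** 2 + g2 * (omega * mu) ** 2


def phi_star(mu, omega, gamma):
    """Macro function: the limit evaluated at psi = mu."""
    return phi_double_star(mu, mu, omega, gamma)


def central_diff(fn, point, step=1e-5):
    """Central finite difference approximation to the derivative of fn at point."""
    return (fn(point + step) - fn(point - step)) / (2.0 * step)


gamma = (1.0, 0.5, 0.2)
mu, omega = 4.0, 0.5

# Derivative with respect to the dummy argument psi, evaluated at psi = mu.
g_term = central_diff(lambda p: phi_double_star(p, mu, omega, gamma), mu)
# Derivative with respect to the mean argument, holding psi fixed at mu.
h_term = central_diff(lambda m: phi_double_star(mu, m, omega, gamma), mu)
# Full derivative of the macro function with respect to the mean.
total = central_diff(lambda m: phi_star(m, omega, gamma), mu)

print(round(g_term, 6), round(h_term, 6), round(total, 6))
```

Here the second term is nonzero, so a slope coefficient that converges to the
first term alone would understate the full derivative of the macro function.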

This condition is important enough to merit a name:

DEFINITION: $\bar{V}_N^t$ is asymptotically sufficient for determining $\bar{x}_N^t$ conditional
on $\theta_0^t$ if

    $\lim_{N^t \to \infty} \tilde{x}(\psi, \mu_v^t, \theta_0^t, \gamma^t, N^t) = \phi^{**}(\psi, \theta_0^t, \gamma^t)$

$\bar{V}_N^t$ is asymptotically sufficient for determining $\bar{x}_N^t$ (unconditionally)
if

    $\lim_{N^t \to \infty} \tilde{x}(\psi, \mu_v^t, \theta_0^t, \gamma^t, N^t) = \phi^{**}(\psi, \gamma^t)$


We  can  summarize  the  results  of  this  section  as 


THEOREM 3.4: Under Assumptions A.1, A.2, A.3, A.4 and A.5 micro slope
regression coefficients will always consistently estimate the first derivatives
of the macro function $\phi^*$ (holding $\theta_0^t$ constant) if and only if $\bar{V}_N^t$ is
asymptotically sufficient for determining $\bar{x}_N^t$ (conditional on $\theta_0^t$).

Similarly,  small  sample  counterparts  to  asymptotic  sufficiency 
can  be  defined.  


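DEFINITION: $\bar{V}_N^t$ is sufficient for determining $\bar{x}_N^t$ conditional on
$\theta_0^t$ if $\tilde{x}$ can be written as

    $\tilde{x}(\bar{V}_N^t, \mu_v^t, \theta_0^t, \gamma^t, N^t) \equiv \tilde{x}(\bar{V}_N^t, \theta_0^t, \gamma^t, N^t)$ for all $N^t$

$\bar{V}_N^t$ is sufficient for determining $\bar{x}_N^t$ if $\tilde{x}$ can be written as

    $\tilde{x}(\bar{V}_N^t, \mu_v^t, \theta_0^t, \gamma^t, N^t) \equiv \tilde{x}(\bar{V}_N^t, \gamma^t, N^t)$ for all $N^t$

Obviously, if $\bar{V}_N^t$ is sufficient for determining $\bar{x}_N^t$ (conditional on $\theta_0^t$) then
$\bar{V}_N^t$ is asymptotically sufficient for determining $\bar{x}_N^t$ (conditional on $\theta_0^t$), and
so all of our results hold if (small sample) sufficiency obtains.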

We  close  this  section  by  presenting  a  theorem  and  corollary  which  can 
be  used  under  certain  conditions  to  characterize  asymptotic  sufficiency  as 
defined  above. 


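THEOREM 3.5: Assume A.1, A.2, A.4, A.5 and let

    $p^{**}(A_1^t, \ldots, A_{N^t}^t \mid \bar{V}_N^t = V_0, \mu_v^t, \theta_0^t) = \begin{cases} 0 & \text{if } \bar{V}_N^t \neq V_0 \\ \dfrac{\prod_{n} p^*(A_n^t \mid \mu_v^t, \theta_0^t)}{P(V_0 \mid \mu_v^t, \theta_0^t)} & \text{if } \bar{V}_N^t = V_0 \end{cases}$

denote the density of $A_1^t, \ldots, A_{N^t}^t$ conditional on $\bar{V}_N^t = V_0$, where $P(\bar{V}_N^t \mid \mu_v^t, \theta_0^t)$
is the marginal density of $\bar{V}_N^t$, and let $e_i$ be the M-vector with i-th
component 1 and all other components zero, i = 1, ..., M. If $p^*$ and P are
differentiable with respect to each component of $\mu_v^t$, and the difference
quotients

    $\frac{1}{h}\left( p^{**}(\cdot \mid V_0, \mu_v^t + h e_i, \theta_0^t) - p^{**}(\cdot \mid V_0, \mu_v^t, \theta_0^t) \right)$,   i = 1, ..., M

are all bounded by integrable functions of $A_1^t, \ldots, A_{N^t}^t$, $V_0$ for $0 < |h| < h_0$,
then

    $\nabla_{\mu_v} \tilde{x}(V_0, \mu_v^t, \theta_0^t, \gamma^t, N^t) = E(\bar{x}_N^t\, \delta_N^t \mid \bar{V}_N^t = V_0)$

where

    $\delta_N^t = \sum_{n=1}^{N^t} \nabla_{\mu_v} \ln p^*(A_n^t \mid \mu_v^t, \theta_0^t) - E\left( \sum_{n=1}^{N^t} \nabla_{\mu_v} \ln p^*(A_n^t \mid \mu_v^t, \theta_0^t) \,\Big|\, \bar{V}_N^t = V_0 \right)$

Proof: See the Appendix.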

We immediately obtain the obvious corollary:

COROLLARY 3.6: Under the conditions of Theorem 3.5,
$\bar{V}_N^t$ is sufficient for determining $\bar{x}_N^t$ conditional on $\theta_0^t$ if
$\bar{x}_N^t$ and $\sum_n \nabla_{\mu_v} \ln p^*(A_n^t \mid \mu_v^t, \theta_0^t)$ have zero covariance conditional
on $\bar{V}_N^t$, for all parameter points. If this covariance converges in probability
to zero as $N \to \infty$, $\bar{V}_N^t$ is asymptotically sufficient for determining $\bar{x}_N^t$
conditional on $\theta_0^t$.

We see heuristically that micro slope coefficients are useful in
characterizing the macro function $\phi^*$ if $\bar{V}_N^t$ effectively determines all
interaction between $\bar{x}_N^t$ and the gradient of the log-likelihood function
$\sum_n \nabla_{\mu_v} \ln p^*(A_n^t \mid \mu_v^t, \theta_0^t)$ in a large sample. As discussed shortly, this
can occur by $\bar{V}_N^t$ completely characterizing either $\bar{x}_N^t$ or $\sum_n \nabla_{\mu_v} \ln p^*(A_n^t \mid \mu_v^t, \theta_0^t)$
individually, or in a joint way in the sense of Corollary 3.6.


3.2   Discussion  and  Examples 

The concept of asymptotic sufficiency is closely related to the notion
of correctly specifying the predictor variables in the micro relation. To see
this, consider the following simple errors-in-variables model.
Suppose that

    $x_n^t = \beta u_n^t + s_n^t$

and

    $y_n^t = u_n^t + r_n^t$

where $u_n^t$, $r_n^t$, $s_n^t$ are independent normal variables (as functions of the
underlying $A_n^t$) with $E(u_n^t) = \mu$, $E(r_n^t) = E(s_n^t) = 0$, $VAR(u_n^t) = \sigma_u^2$,
$VAR(r_n^t) = \sigma_r^2$ and $VAR(s_n^t) = \sigma_s^2$. Our aim is to study $\bar{x}_N^t$ as
a function of $\bar{y}_N^t$, or equivalently $\phi^*(\mu, \sigma_u^2, \sigma_r^2, \sigma_s^2) = \beta\mu$ as a function
of $\mu = E(y_n^t)$. We have that

    $\tilde{x}_N^t = E(\bar{x}_N^t \mid \bar{y}_N^t, \mu, \sigma_u^2, \sigma_r^2, \sigma_s^2) = \beta\lambda\mu + (\beta - \beta\lambda)\, \bar{y}_N^t$

where

    $\lambda = \sigma_r^2 / (\sigma_r^2 + \sigma_u^2)$.

Also, in accordance with Theorem 3.5, we have

    $x_n^t = \beta y_n^t - \beta r_n^t + s_n^t$

and

    $\delta_N^t = \frac{N}{\sigma_u^2} \left( \lambda (\bar{y}_N^t - \mu) - \bar{r}_N^t \right)$
Now, we can easily calculate

    $\nabla_\mu \tilde{x}_N^t = E(\bar{x}_N^t\, \delta_N^t \mid \bar{y}_N^t) = \beta\lambda$

giving both $\nabla_\mu \tilde{x}_N^t$ and the value of the asymptotic (downward) bias
in the least squares estimator of $\beta$. Obviously, sufficiency holds if $\lambda = 0$
(no error in $y_n^t$) or $\beta = 0$ (no structural relation).
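The errors-in-variables example lends itself to a quick simulation. The sketch
below takes $\lambda$ to be the share of measurement error variance in the
variance of y, $\sigma_r^2/(\sigma_r^2 + \sigma_u^2)$, so that the least squares
slope of x on y should settle near $\beta(1-\lambda)$ while the structural
derivative is $\beta$. The parameter values and names are assumptions chosen for
illustration.

```python
import random


def ols_slope(x, y):
    """Least squares slope from regressing x on y and a constant."""
    k = len(x)
    x_bar, y_bar = sum(x) / k, sum(y) / k
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / s_yy


rng = random.Random(4)
beta, mu = 2.0, 5.0
sigma_u, sigma_r, sigma_s = 1.5, 1.0, 0.5
lam = sigma_r ** 2 / (sigma_r ** 2 + sigma_u ** 2)   # error share of Var(y)

x, y = [], []
for _ in range(50000):
    u = rng.gauss(mu, sigma_u)                    # true explanatory variable
    x.append(beta * u + rng.gauss(0.0, sigma_s))  # dependent variable
    y.append(u + rng.gauss(0.0, sigma_r))         # observed, error-ridden variable

slope = ols_slope(x, y)
g_limit = beta * (1.0 - lam)   # large sample value of the micro slope
h_term = beta * lam            # part of the macro derivative the slope misses
print(round(slope, 3), round(g_limit, 3), round(h_term, 3))
```

The two pieces add up to $\beta$, and the missing piece disappears exactly when
there is no measurement error or no structural relation.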

From  an  empirical  point  of  view,  the  concept  of  unconditional 
asymptotic  sufficiency  is  the  most  important  for  micro  analysis  of  macro 
functions.   In  a  sense,  all  micro-macro  statistical  modeling  is  conditional 
on  certain  unobserved  distributional  movements,  for  if  average  statistics 
were  observed  that  depicted  these  movements,  they  would  be  incorporated 

directly. Moreover, all of the results conditional on $\theta_0^t$ depend crucially on the way $\theta_0^t$ is chosen in the $\theta^t \to (\mu_v^t, \theta_0^t)$ reparameterization. A simple example makes this point concretely. For a given period $t$, take $A_k$ to be a scalar random variable distributed normally with mean $\mu$ and variance $\sigma^2$. Suppose that $x_k$ is functionally determined by $A_k$ as

$$x_k = \gamma_0 + \gamma_1 A_k + \gamma_2 A_k^2$$

This is an exact aggregation model if $\bar{x}$ is explained by $\bar{A}$ and $\overline{A^2}$. However, suppose that we have only $\bar{V} = \sum_n A_n / N = \bar{A}$ as data along with $\bar{x}$ for the full population. We therefore regress $x_k$ on $A_k$ as in $x_k = a + b A_k$, and denote the least squares estimate of $b$ as $b_K$. Given the true model, simple specification analysis techniques give that

$$\text{plim}\; b_K = \gamma_1 + 2\gamma_2\mu$$

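We can form $\phi^*$ and $\tilde{x}$ for this problem as

$$\phi^*(\mu, \sigma^2, \gamma) = \gamma_0 + \gamma_1\mu + \gamma_2(\mu^2 + \sigma^2)$$

$$\tilde{x} = E(\bar{x} \mid \bar{A}, \mu, \sigma^2) = \gamma_0 + \gamma_1\bar{A} + \gamma_2\bar{A}^2 + \gamma_2\sigma^2\,\frac{N-1}{N}$$

Thus $\bar{A}$ is (asymptotically) sufficient for determining $\bar{x}$ conditional on $\sigma^2$, and we find that

$$\text{plim}\; b_K = \gamma_1 + 2\gamma_2\mu = \left.\frac{\partial\phi^*}{\partial\mu}\right|_{\sigma^2}$$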

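in accordance with our results. Alternatively, suppose that we parameterize the normal distribution of $A_k$ in terms of the mean $\mu$ and the coefficient of variation $\omega = \sigma/\mu$. In this case

$$\phi^*(\mu, \omega, \gamma) = \gamma_0 + \gamma_1\mu + \gamma_2(\mu^2 + \omega^2\mu^2)$$

and

$$\tilde{x} = \gamma_0 + \gamma_1\bar{A} + \gamma_2\bar{A}^2 + \gamma_2\omega^2\mu^2\,\frac{N-1}{N}$$

Now $\bar{A}$ is not asymptotically sufficient for determining $\bar{x}$ and we find that

$$\text{plim}\; b_K = \gamma_1 + 2\gamma_2\mu \neq \gamma_1 + 2\gamma_2\mu(1 + \omega^2) = \left.\frac{\partial\phi^*}{\partial\mu}\right|_{\omega}$$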

Thus  all  the  conditional  results  weigh  heavily  on  exactly  which  parameters 

are  held  constant.   Because  of  this  property,  we  consider  mainly  results  of 

the  unconditional  type  in  the  rest  of  the  exposition. 
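
The dependence of the conditional result on the chosen parameterization can be checked numerically. The following is only a minimal sketch and not part of the original analysis: the function names and the particular values of $\gamma$, $\mu$ and $\sigma^2$ are illustrative assumptions. It computes the population slope Cov(x, A)/Var(A) for normal $A$ and compares it with the derivative of $\phi^*$ with respect to $\mu$, first holding $\sigma^2$ fixed and then holding $\omega = \sigma/\mu$ fixed.

```python
def phi_star_sigma(mu, sigma2, g0, g1, g2):
    """Macro function E(x) parameterized by (mu, sigma^2)."""
    return g0 + g1 * mu + g2 * (mu ** 2 + sigma2)


def phi_star_cv(mu, omega, g0, g1, g2):
    """Same macro function parameterized by (mu, omega), omega = sigma / mu."""
    return g0 + g1 * mu + g2 * (mu ** 2 + (omega * mu) ** 2)


def plim_slope(mu, sigma2, g1, g2):
    """Population least squares slope of x on A: Cov(x, A) / Var(A)."""
    cov_a_a2 = 2.0 * mu * sigma2  # Cov(A, A^2) when A is normal
    return (g1 * sigma2 + g2 * cov_a_a2) / sigma2


def central_diff(f, x, h=1e-5):
    """Central finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Illustrative (hypothetical) parameter values.
g0, g1, g2 = 1.0, 0.5, 0.25
mu, sigma2 = 2.0, 1.44
omega = sigma2 ** 0.5 / mu

slope = plim_slope(mu, sigma2, g1, g2)
d_sigma = central_diff(lambda m: phi_star_sigma(m, sigma2, g0, g1, g2), mu)
d_omega = central_diff(lambda m: phi_star_cv(m, omega, g0, g1, g2), mu)

print("plim b:                      ", slope)
print("d phi*/d mu, sigma^2 fixed:  ", d_sigma)
print("d phi*/d mu, omega fixed:    ", d_omega)
```

Under these assumed values the slope matches the derivative that holds $\sigma^2$ fixed and differs from the one that holds $\omega$ fixed, which is the point of the example.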

The  concept  of  asymptotic  sufficiency  as  presented  above  provides 

the  correct  foundation  for  using  single  period  micro  slope  estimates  to 

study  the  macro  function.  If  asymptotic  sufficiency  does  not  hold,  then  all 

the  slope  coefficients  provide  is  information  on  sampling  corrections  to  be 

applied to $\tilde{x}$ when deviations of $\bar{V}$ are observed ($\nabla_{\bar{V}}\tilde{x}$ in our notation), instead of the full structural effect ($\nabla_{\mu_v}\phi^*$). In this sense, absence of this assumption severely limits the usefulness of micro data analysis.

For  example,  suppose  we  are  studying  the  relation  of  consumption  to  income, 

and  a  positive  coefficient  is  found  in  a  micro  regression  of  consumption  on 

income.   To  infer  that  mean  consumption  will  increase  with  increases  in  mean 

income implicitly places bounds on $\nabla_{\mu_v}\tilde{x}$, the unobserved component of $\nabla_{\mu_v}\phi^*$. To quantify the macro effect with the micro coefficient requires full asymptotic sufficiency of average income in determining average consumption.

At this point, it is useful to discuss two sets of modeling assumptions which appear as polar extremes under which sufficiency of $\bar{V}_N^t$ in determining $\bar{x}_N^t$ holds. The first is termed functional sufficiency, where


^N^  =  0  or 


-t  _I^  (\'>  y')      .   _,   , 


for all underlying distribution forms $p^*$. This is just a restatement of the exact aggregation form, and so under the general conditions of Lau's Theorem (2.12) and (2.13), $f(A_n^t, \gamma^t)$ must be linear in $v(A_n^t)$, as well as $\bar{x}_N^t$ in $\bar{V}_N^t$.

The second set of assumptions is termed distributional sufficiency. Suppose that $f(A_n^t, \gamma^t)$ is unrestricted, but that $\bar{V}_N^t$ is a sufficient statistic for $\theta^t$ in the distribution of $A_1^t, \ldots, A_{N^t}^t$. Then $\bar{V}_N^t$ is obviously sufficient for determining $\bar{x}_N^t$, as $\tilde{x}_N^t$ is always a function of $\bar{V}_N^t$, $\gamma^t$ and $N^t$ only. In addition, $\phi^*$ can be a nonlinear function of $\mu_v^t$. Finally, if $\psi_N^t = 0$, and Lau's Theorem is applicable to

$$\sum_{n=1}^{N^t} \nabla_{\mu_v} \ln p^*(A_n^t \mid \mu_v^t, \theta_0^t) = E\Big(\sum_{n=1}^{N^t} \nabla_{\mu_v} \ln p^*(A_n^t \mid \mu_v^t, \theta_0^t) \,\Big|\, \bar{V}_N^t\Big)$$

we obtain that $p$ must be a member of the exponential family of distributions. The topic of the next section is the characterization of $\phi^*$ when $p$ is a member of this family.

4.   Distributional  Sufficiency  and  Macro  Functions

In this section we present a methodology for estimating second order derivatives of $\phi^*$ with respect to $\mu_v^t$ from cross section moments, under the assumption that $p$ is a member of the exponential family of distributions. This methodology in principle extends to derivatives of $\phi^*$ of all orders; however, due to its complexity we consider only second order derivatives here.

The question of how to estimate higher order derivatives of $\phi^*$ in the general case (that is, with asymptotic sufficiency and without distributional sufficiency) is still open. However, the formula presented below will allow tests of exact and consistent aggregation models, and thus extends somewhat beyond the distributional sufficiency case.

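$p(A \mid \theta^t)$ is a member of the exponential family of distributions if it can be written in the form

$$p(A \mid \theta^t) = C(\theta^t)\, h(A) \exp\Big(\sum_{m=1}^{M} \pi_m(\theta^t)\, v_m(A)\Big) \tag{4.1}$$

where

$$C(\theta^t) = \Big(\int h(A) \exp\Big(\sum_{m=1}^{M} \pi_m(\theta^t)\, v_m(A)\Big)\, dA\Big)^{-1}$$

The joint density of the population $A_1^t, \ldots, A_{N^t}^t$ can then be written as

$$\prod_{n=1}^{N^t} p(A_n^t \mid \theta^t) = C(\theta^t)^{N^t} \prod_{n=1}^{N^t} h(A_n^t)\, \exp\Big(\sum_{m=1}^{M} \pi_m(\theta^t)\, N^t \bar{V}_m^t\Big)$$

By the factorization criterion, $\bar{V}^t$ (and $N^t$) are sufficient for $\theta^t$.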

Beyond the motivation for (4.1) given in section 3.2 based on Lau's Theorem, the appeal of the form (4.1) is that under relatively general conditions, if a sufficient statistic $T$ of dimension $M < N$ exists, then the density of the distribution must have the form (4.1). Very briefly, if the range of variation of $A$ does not depend on $\theta^t$, a continuously differentiable sufficient statistic $T$ for $\theta^t$ exists and $p(A \mid \theta^t)$ is continuously differentiable in $A$ and $\theta^t$ (as well as some other mild regularity conditions), then $p(A \mid \theta^t)$ must locally have the form (4.1). If $p(A \mid \theta^t)$ is further assumed to be analytic, then (4.1) is the global form of the density. This theorem is quite complex, and to adequately discuss it here would take us too far afield. Therefore, the interested reader is referred to references in the statistical literature.$^{23}$

We begin our study by adopting

ASSUMPTION A.6: $p$ can be written as a member of the exponential family in its natural parameterization:

$$p(A \mid \pi^t) = C(\pi^t)\, h(A) \exp\Big(\sum_{m=1}^{M} \pi_m^t\, v_m(A)\Big) \tag{4.3}$$

where

$$C(\pi^t) = \Big(\int h(A) \exp\Big(\sum_{m=1}^{M} \pi_m^t\, v_m(A)\Big)\, dA\Big)^{-1}$$

and where $\theta^t$ has been reparameterized by $\pi^t = (\pi_1^t, \ldots, \pi_M^t)'$.

(4.3) holds from (4.1) without loss of generality if the mapping $\theta^t \to (\pi_1(\theta^t), \ldots, \pi_M(\theta^t))' = \pi^t$ is of rank $M$. Therefore, Assumption A.6 just eliminates constraints across $\pi_m(\theta^t)$, $m = 1, \ldots, M$, which, from an empirical point of view, are unnecessary at the outset.

Two useful textbook facts about the form (4.3) are

LEMMA 4.1: Under Assumption A.6, the natural parameter space $\Gamma = \{\pi^t \mid p(A \mid \pi^t) \text{ is a density}\}$ is convex.

LEMMA 4.2: If $\psi(A_1^t, \ldots, A_{N^t}^t)$ is a function for which the integral

$$\int \cdots \int \psi(A_1^t, \ldots, A_{N^t}^t) \prod_{n=1}^{N^t} h(A_n^t)\, \exp\Big(\sum_{m=1}^{M} \pi_m^t\, N^t \bar{V}_m^t\Big)\, dA_1^t \cdots dA_{N^t}^t$$

exists for all $\pi^t \in \Gamma$, then this integral is an analytic function of $\pi^t$ at all interior points of $\Gamma$, and derivatives of all orders with respect to $\pi^t$ may be passed beneath the integral sign (for discrete exponential families this integral is replaced by a sum).

Proofs of these lemmas can be found in Lehmann (1959). They allow a computational method for taking derivatives of various expectations. Recall, as in earlier sections, that we denote

$$\phi(\pi^t, \gamma^t) = E(x^t \mid \pi^t) \tag{4.4}$$

and that

$$C(\pi^t) = \Big(\int h(A) \exp\Big(\sum_{m=1}^{M} \pi_m^t\, v_m(A)\Big)\, dA\Big)^{-1}$$

$C(\pi^t)$ appears in (4.3) as just a normalizing factor to make $p(A \mid \pi^t)$ a density. Both $\phi$ and $C$ have some remarkable properties, however, as shown in the following lemma:


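LEMMA 4.3: Under Assumptions A.1 and A.6, all derivatives of $\phi$ and $\ln C$ with respect to $\pi^t$ are expressible as moments of the $x^t, v(A^t)$ distribution. In particular, we have for $C$ that

$$-\frac{\partial \ln C}{\partial\pi_m} = E(v_m(A) \mid \pi^t) = \mu_m^t, \qquad m = 1, \ldots, M$$

$$-\frac{\partial^2 \ln C}{\partial\pi_m\, \partial\pi_{m'}} = E\big((v_m(A) - \mu_m^t)(v_{m'}(A) - \mu_{m'}^t) \mid \pi^t\big) = \sigma_{mm'}^t, \qquad m, m' = 1, \ldots, M$$

and

$$-\frac{\partial^3 \ln C}{\partial\pi_m\, \partial\pi_{m'}\, \partial\pi_\ell} = E\big((v_m(A) - \mu_m^t)(v_{m'}(A) - \mu_{m'}^t)(v_\ell(A) - \mu_\ell^t) \mid \pi^t\big) = \sigma_{mm'\ell}^t, \qquad m, m', \ell = 1, \ldots, M$$

For $\phi$ we have

$$\frac{\partial\phi}{\partial\pi_m} = E\big((x - \phi(\pi^t, \gamma^t))(v_m(A) - \mu_m^t) \mid \pi^t\big) = \sigma_{xm}^t$$

and

$$\frac{\partial^2\phi}{\partial\pi_m\, \partial\pi_{m'}} = E\big((x - \phi(\pi^t, \gamma^t))(v_m(A) - \mu_m^t)(v_{m'}(A) - \mu_{m'}^t) \mid \pi^t\big) = \sigma_{xmm'}^t, \qquad m, m' = 1, \ldots, M$$

Proof: The first statement follows from Lemma 4.2. The formulae are obtainable by direct computation.$^{24}$

We are primarily interested in the behavior of $\phi(\pi^t, \gamma^t)$ with respect to changes in $\mu_v^t$. We proceed as before to reparameterize via the mapping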

$$\mu_v^t = E(v(A) \mid \pi^t) = g(\pi^t)$$

In view of Lemma 4.3, this mapping is expressible as

$$\mu_v^t = -\nabla_\pi \ln C(\pi^t) = g(\pi^t) \tag{4.6}$$

We can reparameterize the distribution (4.3) in terms of $\mu_v^t$ if the mapping $g$ is invertible; i.e. if the differential matrix $dg$ is nonsingular. This matrix, again from Lemma 4.3, can be written as

$$dg = \Big[-\frac{\partial^2 \ln C}{\partial\pi_m\, \partial\pi_{m'}}\Big] = \Sigma_{vv}^t$$

the covariance matrix of $v(A^t)$. Thus, Assumption A.2 is now (under A.6) implied by Assumption A.1. We therefore form

$$\pi^t = g^{-1}(\mu_v^t)$$

$$p^*(A \mid \mu_v^t) = p(A \mid g^{-1}(\mu_v^t)) \tag{4.8}$$

and $\phi^*(\mu_v^t, \gamma^t) = \phi(g^{-1}(\mu_v^t), \gamma^t)$.

We are now in a position to show the main result of section 3.1 by direct computation.
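
As an illustration of Lemma 4.3 and the mapping (4.6), the derivatives of $-\ln C$ can be compared with directly computed moments in a simple discrete case. The sketch below is an assumption-laden example rather than anything taken from the text: it uses the Poisson family, for which $h(A) = 1/A!$, $v(A) = A$ and $\pi = \ln \mu$, replaces the integral by a truncated sum (as Lemma 4.2 permits for discrete families), and uses finite differences. All function names are hypothetical.

```python
import math


def neg_log_c(pi, support=200):
    """-ln C(pi) = ln sum_A h(A) exp(pi * A), with h(A) = 1 / A! (Poisson family)."""
    terms = [math.exp(pi * a - math.lgamma(a + 1)) for a in range(support)]
    return math.log(math.fsum(terms))


def central_moments(pi, support=200):
    """Mean, variance and third central moment of v(A) = A under p(A | pi)."""
    log_z = neg_log_c(pi, support)
    probs = [math.exp(pi * a - math.lgamma(a + 1) - log_z) for a in range(support)]
    mean = math.fsum(a * p for a, p in enumerate(probs))
    var = math.fsum((a - mean) ** 2 * p for a, p in enumerate(probs))
    third = math.fsum((a - mean) ** 3 * p for a, p in enumerate(probs))
    return mean, var, third


def finite_differences(f, x, h=1e-2):
    """Central finite-difference estimates of f', f'' and f''' at x."""
    d1 = (f(x + h) - f(x - h)) / (2 * h)
    d2 = (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2
    d3 = (f(x + 2 * h) - 2 * f(x + h) + 2 * f(x - h) - f(x - 2 * h)) / (2 * h ** 3)
    return d1, d2, d3


pi_t = math.log(3.0)  # natural parameter; the implied Poisson mean is 3
mean, var, third = central_moments(pi_t)
d1, d2, d3 = finite_differences(neg_log_c, pi_t)

print("g(pi) = -d lnC/d pi :", d1, " vs mean          :", mean)
print("dg    = -d2 lnC/d pi2:", d2, " vs variance      :", var)
print("        -d3 lnC/d pi3:", d3, " vs third c.moment:", third)
```

In this family the variance is strictly positive, so $dg$ is nonsingular and the reparameterization in terms of the mean is available.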


THEOREM 4.4: Under Assumptions A.1 and A.6 the gradient of $\phi^*$ with respect to $\mu_v^t$ is

$$\nabla_{\mu_v} \phi^*(\mu_v^t, \gamma^t) = (\Sigma_{vv}^t)^{-1}\, \Sigma_{xv}^t$$

and so is consistently estimated by the micro slope regression coefficients from a single period cross section data source.

Proof: By the chain rule,

$$\nabla_{\mu_v} \phi^* = (dg^{-1})\, \nabla_\pi \phi$$

Now $dg^{-1} = (dg)^{-1} = (\Sigma_{vv}^t)^{-1}$ and by Lemma 4.3, $\nabla_\pi \phi = \Sigma_{xv}^t$.

QED.
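
Theorem 4.4 can be illustrated in the scalar case with a short numerical check. The sketch below makes several assumptions that are not in the text: the population is Poisson with mean $\mu$ (a one-parameter exponential family with $v(A) = A$), the micro function $f(A) = \ln(1 + A)$ is an arbitrary nonlinear choice, sums are truncated, and the function names are hypothetical. It compares the population regression slope $\sigma_{x1}/\sigma_{11}$ with a finite-difference derivative of $\phi^*(\mu) = E f(A)$.

```python
import math


def poisson_pmf(lam, support=200):
    """Poisson probabilities on 0, ..., support - 1 (truncated support)."""
    return [math.exp(a * math.log(lam) - lam - math.lgamma(a + 1))
            for a in range(support)]


def macro_function(f, lam):
    """phi*(mu) = E f(A) when A is Poisson with mean mu = lam."""
    return math.fsum(f(a) * p for a, p in enumerate(poisson_pmf(lam)))


def micro_slope(f, lam):
    """Population regression slope sigma_x1 / sigma_11 of x = f(A) on v(A) = A."""
    probs = poisson_pmf(lam)
    phi = math.fsum(f(a) * p for a, p in enumerate(probs))
    sigma_x1 = math.fsum((f(a) - phi) * (a - lam) * p for a, p in enumerate(probs))
    sigma_11 = math.fsum((a - lam) ** 2 * p for a, p in enumerate(probs))
    return sigma_x1 / sigma_11


def macro_derivative(f, lam, h=1e-4):
    """Central finite-difference derivative of phi* with respect to mu."""
    return (macro_function(f, lam + h) - macro_function(f, lam - h)) / (2 * h)


def micro_f(a):
    """Hypothetical nonlinear micro relation x = f(A)."""
    return math.log(1.0 + a)


mu = 3.0
slope = micro_slope(micro_f, mu)
derivative = macro_derivative(micro_f, mu)
print("micro slope sigma_x1 / sigma_11:", slope)
print("d phi* / d mu (finite diff.):   ", derivative)
```

The two numbers agree even though $f$ is nonlinear, which is the content of the theorem: under distributional sufficiency the cross section slope measures the local derivative of the macro function, not a global linear coefficient.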

We can similarly calculate all higher order derivatives of $\phi^*$ with respect to $\mu_v^t$ as functions of moments of the $x^t, v(A^t)$ distribution. Because these calculations increase greatly in complexity as the order of the derivatives increases, we present only the second derivative calculation, which will allow statistical investigation of the case of a macro function linear in $\mu_v^t$. We first require some new notation to facilitate the formulae:


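$\Sigma_{\ell\pi\pi}^t$ denotes the $M \times M$ matrix with $m,m'$ element $\sigma_{\ell mm'}^t$ (see Lemma 4.3), $\ell = 1, \ldots, M$;

$U_{\pi\pi}^t$ denotes the $M \times M^2$ matrix $[\Sigma_{1\pi\pi}^t, \ldots, \Sigma_{M\pi\pi}^t]$;

$\Sigma_{xvv}^t$ denotes the $M \times M$ matrix with $m,m'$ element $\sigma_{xmm'}^t$;

$I_M$ is the identity matrix of order $M$ and finally, $P_M$ denotes the $M^2 \times M^2$ matrix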


^M  = 


11 


"Ml 


IM 


•  •  P 


MM 


^mm^  ^^  *^^  ^^  ^^r±x   with  m^m  element  1  and  remaining  entries  0, 
We can now show

THEOREM 4.5: The matrix of second order derivatives of $\phi^*$ with respect to $\mu_v^t$ evaluated at period $t$, written as $\nabla^2_{\mu_v}\phi^*$, can be computed as
(^•9)  V^  <i>*  =  _(Z*    )-^n     l\MJI^     ]P  [l    ®E~  S*   1 

y  W     TTTT   M^  W  -^  M"-  M     VV  XV-^ 

w    xw  w 


($\nabla^2_{\mu_v}\phi^*$ is the $M \times M$ matrix with $m,m'$ element $\partial^2\phi^* / \partial\mu_m^t\, \partial\mu_{m'}^t$.)

The proof is by direct computation, but is relatively complex so we present a sketch of it in the Appendix.


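The formula (4.9) for second derivatives of $\phi^*$ is sufficiently complex to warrant illustration by a simple example. Suppose $M = 1$, or that $A$ is distributed according to

$$p(A \mid \pi^t) = C(\pi^t)\, h(A) \exp(\pi^t v_1(A))$$

with $\pi^t$ a scalar parameter, and

$$E(x \mid \pi^t) = \phi(\pi^t, \gamma^t) = \phi^*(\mu_1^t, \gamma^t)$$

where $\mu_1^t$ is the (scalar) mean of $v_1(A)$. Now, in accordance with Theorem 4.4 we find that

$$\frac{\partial\phi^*}{\partial\mu_1} = \frac{\partial\phi}{\partial\pi}\, \frac{\partial\pi}{\partial\mu_1} = \sigma_{x1}^t (\sigma_{11}^t)^{-1} = \frac{\sigma_{x1}^t}{\sigma_{11}^t} = \text{plim}\; b$$

where $b$ is the estimated coefficient from the cross section regression equation $x_k = a + b\, v_1(A_k)$. Now

$$\frac{\partial^2\phi^*}{\partial\mu_1^2} = \frac{\partial^2\phi}{\partial\pi^2} \Big(\frac{\partial\pi}{\partial\mu_1}\Big)^2 + \frac{\partial\phi}{\partial\pi}\, \frac{\partial^2\pi}{\partial\mu_1^2} \tag{4.10}$$

By Lemma 4.3, we have

$$\frac{\partial^2\phi}{\partial\pi^2} = \sigma_{x11}^t, \qquad \frac{\partial\pi}{\partial\mu_1} = \frac{1}{\sigma_{11}^t}, \qquad \frac{\partial\phi}{\partial\pi} = \sigma_{x1}^t$$

and so we must find $\partial^2\pi / \partial\mu_1^2$. Since $\partial\pi / \partial\mu_1 = (\sigma_{11}^t)^{-1}$, by differentiation with respect to $\mu_1$ we get

$$\frac{\partial^2\pi}{\partial\mu_1^2} = -\frac{1}{(\sigma_{11}^t)^2}\, \frac{\partial\sigma_{11}^t}{\partial\pi}\, \frac{\partial\pi}{\partial\mu_1} = -\frac{\sigma_{111}^t}{(\sigma_{11}^t)^3}$$

Inserting these values into (4.10) gives

$$\frac{\partial^2\phi^*}{\partial\mu_1^2} = \frac{\sigma_{x11}^t}{(\sigma_{11}^t)^2} - \frac{\sigma_{x1}^t\, \sigma_{111}^t}{(\sigma_{11}^t)^3} \tag{4.11}$$

which agrees with (4.9) for $M = 1$.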
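
The scalar formula (4.11) can be checked against a case where the macro function is known in closed form. The sketch below is illustrative only and rests on assumptions not made in the text: a Poisson population with $v_1(A) = A$, truncated sums for the moments, and hypothetical function names. With $x = A^2$ the macro function is $\phi^*(\mu) = \mu + \mu^2$, whose second derivative is 2; with $x = e^{-A/2}$ it is $\phi^*(\mu) = \exp(\mu(e^{-1/2} - 1))$.

```python
import math


def poisson_pmf(lam, support=200):
    """Poisson probabilities on 0, ..., support - 1 (truncated support)."""
    return [math.exp(a * math.log(lam) - lam - math.lgamma(a + 1))
            for a in range(support)]


def moments(f, lam):
    """Return phi, sigma_11, sigma_111, sigma_x1, sigma_x11 for x = f(A), v_1(A) = A."""
    probs = poisson_pmf(lam)
    phi = math.fsum(f(a) * p for a, p in enumerate(probs))
    s11 = math.fsum((a - lam) ** 2 * p for a, p in enumerate(probs))
    s111 = math.fsum((a - lam) ** 3 * p for a, p in enumerate(probs))
    sx1 = math.fsum((f(a) - phi) * (a - lam) * p for a, p in enumerate(probs))
    sx11 = math.fsum((f(a) - phi) * (a - lam) ** 2 * p for a, p in enumerate(probs))
    return phi, s11, s111, sx1, sx11


def second_derivative_from_moments(f, lam):
    """Right-hand side of (4.11): sigma_x11 / sigma_11^2 - sigma_x1 sigma_111 / sigma_11^3."""
    _, s11, s111, sx1, sx11 = moments(f, lam)
    return sx11 / s11 ** 2 - sx1 * s111 / s11 ** 3


mu = 3.0
quad = second_derivative_from_moments(lambda a: a * a, mu)
expo = second_derivative_from_moments(lambda a: math.exp(-0.5 * a), mu)

c = math.exp(-0.5) - 1.0
expo_exact = c ** 2 * math.exp(mu * c)  # second derivative of exp(mu * c)

print("x = A^2      :", quad, "(closed form: 2)")
print("x = exp(-A/2):", expo, "(closed form:", expo_exact, ")")
```

Both moment expressions reproduce the closed-form second derivatives, so in this assumed setting cross section moments alone recover the curvature of the macro function.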

As we have shown, we can express the second order derivatives of $\phi^*$ in terms of moments of the underlying exponential family population density. Estimating these moments by their sample counterparts in a cross section data base will allow consistent estimation of the second derivatives in that time period. Asymptotic inferences using these estimates are possible by standard methods.$^{25}$ Thus, in particular, we can test whether $\phi^*$ is a linear function of $\mu_v^t$.$^{26}$

In addition, one can easily show that if $x_n^t = f(A_n^t, \gamma^t)$ has either the exact aggregation form (linear in $v(A_n^t)$ with constant coefficients) or the consistent aggregation form (linear with coefficients that vary independently of $v(A_n^t)$), then the expression in (4.9) is identically zero without distributional sufficiency. Thus, (4.9) can be used to test whether either of these model forms holds, although it no longer has the second derivative interpretation. In this sense the usefulness of (4.9) extends beyond the distributional sufficiency case.
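
To make the estimation step concrete, the following sketch replaces the population moments in (4.11) by sample moments from a simulated cross section. It is a rough illustration under stated assumptions, not the author's procedure: the population is taken to be normal with unit variance (so that $v_1(A) = A$ and the mean is the only free parameter), the sample size and seed are arbitrary, the function name is hypothetical, and no standard error is computed.

```python
import random


def estimate_second_derivative(x, v):
    """Plug-in estimate of (4.11) from cross section observations (x_k, v_1(A_k))."""
    n = len(x)
    x_bar = sum(x) / n
    v_bar = sum(v) / n
    dx = [xi - x_bar for xi in x]
    dv = [vi - v_bar for vi in v]
    s11 = sum(d * d for d in dv) / n                       # sample sigma_11
    s111 = sum(d ** 3 for d in dv) / n                     # sample sigma_111
    sx1 = sum(a * b for a, b in zip(dx, dv)) / n           # sample sigma_x1
    sx11 = sum(a * b * b for a, b in zip(dx, dv)) / n      # sample sigma_x11
    return sx11 / s11 ** 2 - sx1 * s111 / s11 ** 3


rng = random.Random(0)
K = 100_000
A = [rng.gauss(1.0, 1.0) for _ in range(K)]   # assumed N(1, 1) cross section

x_linear = [1.0 + 2.0 * a for a in A]         # exact aggregation form
x_quadratic = [a * a for a in A]              # macro function mu^2 + 1

est_linear = estimate_second_derivative(x_linear, A)
est_quadratic = estimate_second_derivative(x_quadratic, A)

print("linear micro model   :", est_linear)
print("quadratic micro model:", est_quadratic)
```

For the linear micro model the statistic is zero up to rounding error whatever the sample, in line with the exact aggregation remark above; for the quadratic model it should be close to 2, the second derivative of $\mu^2 + 1$. A formal test would add a standard error, for example via Lemma AP.1.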


5.   Conclusion 

The major result of this paper is that under asymptotic sufficiency, slope regression coefficients will consistently estimate the first derivatives of the true macro function. Asymptotic sufficiency is a general concept which appears to underlie all quantitative inferences made from cross section data about macro functions. A more general study can only be made with additional information, such as micro data from several time periods, as in panel or reinterview studies.

In addition to the regression result, we have shown that if the average explanatory variables are sufficient for the underlying population distribution parameters, then in principle (in the case of a population density of the exponential family) one can empirically characterize macro function derivatives of all orders using cross section data. Under this additional structure, one can actually test whether the macro function is linear, quadratic or some higher order nonlinear function. These techniques extend to tests of linear aggregation schemes such as exact and consistent aggregation models.

The main appeal of these techniques is that they allow an empirical characterization of macro functions using micro data. In addition, even if the true macro function is linear, the independent effects of the average explanatory variables may be difficult to ascertain because of trending behavior or other data problems (referred to as multicollinearity). As justified here, micro data can be used to estimate these effects, where they are generally more visible. In this spirit, if one wants to approximate a macro function to the first order using average and micro data, then an exact aggregation scheme should be assumed, as the estimates obtained from each data source will coincide in large samples, thus allowing one to take advantage of the increased data input in terms of increased precision of the final estimates. Moreover, the macro function coefficients can be allowed to vary if a structural change is indicated from an additional cross section data source.

Future research in this area can be directed to methods for distinguishing whether a particular data configuration is consistent with either functional or distributional sufficiency. Also, there is a need for techniques to characterize nonlinearities in $\phi^*$ when neither of the above types of sufficiency holds. Finally, and most important, are tests of asymptotic sufficiency, from which the relevance of any of these techniques can be considered.

The results of this research constitute a first step into the area of empirically characterizing macro functions with micro data. Hopefully they will help end the practice of neglecting distributional issues in the study of macro data, a practice which is now so prevalent.

Appendix:   Omitted  Proofs

Proof of Lemma 3.1 b)

Lemma 3.1 b) is shown as the result of combining Lemma 3.1 a) with two other propositions, the first shown in Rao (1973), Section 6a.2:

Lemma AP.1

Let $T_N$ be an $M$ dimensional statistic $(T_{1N}, \ldots, T_{MN})'$ such that the asymptotic distribution of $\sqrt{N}(T_{1N} - \gamma_1), \ldots, \sqrt{N}(T_{MN} - \gamma_M)$ is $M$-variate normal with mean zero and variance covariance matrix $\Sigma_T$. Further, let $g(T_{1N}, \ldots, T_{MN}, N)$ be a function which is totally differentiable in $T_{1N}, \ldots, T_{MN}$, and suppose that $\nabla_{T_N}\, g \to G \neq 0$ as both $N \to \infty$ and $T_N \to (\gamma_1, \ldots, \gamma_M)'$. Then the asymptotic distribution of

$$\sqrt{N}\big(g(T_{1N}, \ldots, T_{MN}, N) - g(\gamma_1, \ldots, \gamma_M, N)\big)$$

is the same as that of

$$\sqrt{N}\,(T_N - \gamma)'G$$

that is, normal with mean zero and variance $G'\Sigma_T G$. Moreover, $g(\gamma_1, \ldots, \gamma_M, N)$ may be replaced in the above by $g^*(\gamma_1, \ldots, \gamma_M)$ if

$$\lim_{N\to\infty} \sqrt{N}\,\big[g(\gamma_1, \ldots, \gamma_M, N) - g^*(\gamma_1, \ldots, \gamma_M)\big] = 0$$

Lemma AP.2

$$\lim_{N\to\infty} \sqrt{N}\,\big(\tilde{x}(\mu_v, \mu_v, \theta_0, \gamma, N) - \phi^*(\mu_v, \theta_0, \gamma)\big) = 0$$

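Proof: Fix $N$ and consider

$$E\big[\sqrt{N}(\bar{x} - \phi^*(\mu_v, \theta_0, \gamma)) - \sqrt{N}(\tilde{x}(\bar{V}, \mu_v, \theta_0, \gamma, N) - \tilde{x}(\mu_v, \mu_v, \theta_0, \gamma, N))\big]$$

$$= E\big(\sqrt{N}(\bar{x} - \tilde{x}(\bar{V}, \mu_v, \theta_0, \gamma, N))\big) + \sqrt{N}\big(\tilde{x}(\mu_v, \mu_v, \theta_0, \gamma, N) - \phi^*(\mu_v, \theta_0, \gamma)\big)$$

$$= 0 + \sqrt{N}\big(\tilde{x}(\mu_v, \mu_v, \theta_0, \gamma, N) - \phi^*(\mu_v, \theta_0, \gamma)\big)$$

Now as $N \to \infty$, the first expectation approaches zero by virtue of Lemma 3.1 a) and Lemma AP.1 applied to $\tilde{x}$. Thus

$$\lim_{N\to\infty} \sqrt{N}\,\big(\tilde{x}(\mu_v, \mu_v, \theta_0, \gamma, N) - \phi^*(\mu_v, \theta_0, \gamma)\big) = 0 \qquad \text{QED}$$

Applying Lemma AP.1 to $\tilde{x}$ in view of Lemmas 3.1 a) and AP.2 gives Lemma 3.1 b).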


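Proof of Theorem 3.5

Conditions (3.11) ii) allow differentiation of

$$\tilde{x}_N^t = \int \bar{x}_N^t\; p^{**}(A_1^t, \ldots, A_{N^t}^t \mid V_0, \mu_v^t, \theta_0)\; dA_1^t \cdots dA_{N^t}^t$$

under the integral sign, which gives

$$\nabla_{\mu_v}\, \tilde{x}_N^t = E\big(\bar{x}_N^t\, \psi_N^t \mid \bar{V}_N^t = V_0, \mu_v^t, \theta_0\big)$$

where

$$\psi_N^t = \sum_{n=1}^{N^t} \nabla_{\mu_v} \ln p^*(A_n^t \mid \mu_v^t, \theta_0) - \nabla_{\mu_v} \ln P(V_0 \mid \mu_v^t, \theta_0)$$

Theorem 3.5 is shown if

$$E\big(\psi_N^t \mid \bar{V}_N^t = V_0, \mu_v^t, \theta_0\big) = 0 \tag{AP.1}$$

or

$$E\Big(\sum_{n=1}^{N^t} \nabla_{\mu_v} \ln p^*(A_n^t \mid \mu_v^t, \theta_0) \,\Big|\, \bar{V}_N^t = V_0, \mu_v^t, \theta_0\Big) = \nabla_{\mu_v} \ln P(V_0 \mid \mu_v^t, \theta_0)$$

By conditions (3.11) i) we can differentiate $1 = E(1 \mid \bar{V}_N^t = V_0, \mu_v^t, \theta_0)$ under the integral sign, which gives (AP.1) above. QED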

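Proof Sketch for Theorem 4.5:

Denote the components of $\pi^t = g^{-1}(\mu_v^t)$ by $g^{-1}(\mu_v^t) = (g_1^{-1}(\mu_v^t), \ldots, g_M^{-1}(\mu_v^t))'$. As in Theorem 4.4,

$$\begin{pmatrix} \partial\phi^*/\partial\mu_1 \\ \vdots \\ \partial\phi^*/\partial\mu_M \end{pmatrix} = \begin{pmatrix} \partial g_1^{-1}/\partial\mu_1 & \cdots & \partial g_M^{-1}/\partial\mu_1 \\ \vdots & & \vdots \\ \partial g_1^{-1}/\partial\mu_M & \cdots & \partial g_M^{-1}/\partial\mu_M \end{pmatrix} \begin{pmatrix} \partial\phi/\partial\pi_1 \\ \vdots \\ \partial\phi/\partial\pi_M \end{pmatrix}$$

Therefore

$$\frac{\partial^2\phi^*}{\partial\mu_i\, \partial\mu_j} = \sum_{m=1}^{M} \frac{\partial\phi}{\partial\pi_m}\, \frac{\partial^2 g_m^{-1}}{\partial\mu_i\, \partial\mu_j} + \sum_{m=1}^{M} \sum_{m'=1}^{M} \frac{\partial^2\phi}{\partial\pi_m\, \partial\pi_{m'}}\, \frac{\partial g_m^{-1}}{\partial\mu_i}\, \frac{\partial g_{m'}^{-1}}{\partial\mu_j} \tag{AP.2}$$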

The second term of the above (the double sum) is expressible in full matrix format as

$$(dg^{-1})\, \Big[\frac{\partial^2\phi}{\partial\pi_m\, \partial\pi_{m'}}\Big]\, (dg^{-1})'$$

where $dg^{-1}$ is the $M \times M$ matrix with $i,m$ element $\partial g_m^{-1}/\partial\mu_i$, which by Lemma 4.3 equals

$$(\Sigma_{vv}^t)^{-1}\, \Sigma_{xvv}^t\, (\Sigma_{vv}^t)^{-1} \tag{AP.3}$$

giving the second term in the statement of the theorem. Now, if $B = [b_{ij}(\mu)]$ is an $M \times M$ matrix of functions of $\mu$ then we denote by $d_{\mu_m} B$ the matrix

$$d_{\mu_m} B = \Big[\frac{\partial b_{ij}(\mu)}{\partial\mu_m}\Big]$$

The first term of (AP.2) is expressible in matrix format as

$$\big[\, d_{\mu_1}(dg^{-1})\, \nabla_\pi\phi, \;\ldots,\; d_{\mu_M}(dg^{-1})\, \nabla_\pi\phi \,\big] \tag{AP.4}$$

Now, in order to evaluate $d_{\mu_m}(dg^{-1})$, $m = 1, \ldots, M$, we use the relation $(dg^{-1})(dg) = I_M$ so that (if $0$ is an $M \times M$ matrix of zeros)

$$d_{\mu_m}(dg^{-1})\, dg + dg^{-1}\, d_{\mu_m}(dg) = 0$$

or

$$d_{\mu_m}(dg^{-1}) = -dg^{-1}\, \big(d_{\mu_m}(dg)\big)\, dg^{-1}, \qquad m = 1, \ldots, M$$

Now,   if   2  denotes   the  MxM  matrix  with  i,    i    element 

miTTi" 


m 


97T.    3tt. 

1      J 


,  we  express 


d        (dg)    as 
m 


t 


~         ,        -1      .    .    .  .        -1 

^iTTTT    U    ^       »     *     *     *'    S>fn-,T      y    ^ 
m  m 


so  that 


(AP.5)  d^   (dg   ^)   =  -dg   ^ 

m 


■~  -1  ■ 

^ITTTT       y    ^        .      •     •     •'    gMlTTT    %    ^ 

m  m 


•^1  dg-^ 


The proof is completed by inserting (AP.5) into (AP.4), making the associations


-1 


dg  ^  =  (Z  ^)    ;  V  (()  =  E  ^ 


W  IT    ^    XV 


g    =  Q.  ;   m  =  1, 


,  M 


fV"'-  •  ■  ■'  w]- ''''  ■[-] 


-1 


and  rewriting  the  whole  expression  in  terms  of  U 


7TTT 


QED 




NOTES 

*   I  wish  to  thank  Jerry  Green  for  helpful  discussion  during  the  early 
stages  of  this  work.   Also  helpful  were  D.  Schmalensee,  F.  Fisher 
and  D.  McFadden.   All  errors  are,  of  course,  the  responsibility  of  the 
author. 

1.  One  of  the  reasons  Friedman's  book  The  Theory  of  the  Consumption  Function 
is  so  masterful  is  that  the  distributional  foundation  is  clearly  stated 
and  investigated  empirically  with  both  macro  and  micro  data,  although 
not  using  pooled  methods  as  advocated  here.   Other  early  works  in  demand 
analysis  which  estimated  income  elasticities  from  cross  section  data 

and  applied  them  to  time  series  analysis  were  Wold  (1953)  and  various 
work  of  Stone,  although  these  authors  did  not  use  aggregating  models 
specifically.   A  recent  demand  application  of  an  exact  aggregation 
model  is  Jorgenson,  Lau,  and  Stoker  (1979). 

2.  This  critique  applies  equally  well  to  studies  of  aggregate  variables 
such  as  national  income,  total  personal  consumption  expenditures,  etc. 

3.  Ideally,  one  requires  micro  data  from  each  period  of  the  time  series 
data  in  order  to  study  the  influence  of  all  distributional  movements. 
Unfortunately,  this  much  data  is  generally  not  available.   Some  exceptions 
are  the  longitudinal,  or  panel  data  sets  now  available  on  wage  rates, 
hours  worked  and  demographic  characteristics  across  families  and  over  time. 

4.  The  only  exception  known  to  this  author  is  Friedman's  permanent  income  - 
permanent  consumption  model.   See  the  first  example  of  Section  3.2  (errors 
in  variables)  for  illustration  of  this  fact. 

5.  $p$ may just be taken as the density of the sample distribution in the population. However, with $N$ sufficiently large, $p$ may be taken as a continuous approximation to this density. We utilize this framework in order to allow structure to be given to the population configuration $\{A_n^t \mid n = 1, \ldots, N^t\}$ via $p(A \mid \theta^t)$.
6.   An  alternative  formulation  of  this  set-up  is  to  take  x  ,  A  as  having  a 


joint  distribution  in  the  population.   Then  we  would  study 

\     =  E(x^  |A^)  =  f(A^,Y^ 
where  y  contains  parameters  of  this  joint  distribution.   All  analysis  in 

the  text  is  then  valid  for  x   in  place  of  x  ,  although  a  bit  must  be  said 

n  n 

about  the  relation  of  x   to  x^. 

7.  Of course, if common prices affect the underlying population distribution (e.g. the distribution of earnings) they may enter as elements of $\theta^t$. Also if prices vary across the population, they are expressible as components of $v(A)$.

8.  If this is not true, we can always invert for a subset of $\mu_1, \ldots, \mu_M$. The full invertibility assumption does not affect the character of our results, but just simplifies the notation. Similarly, we could allow for $M > L$.

9.  See  Rao  (1973),  section  2c. 3  for  a  statement  of  the  Weak  Law  of  Large  Numbers. 

10.  Each  index  k  of  the  random  sample  has  a  counterpart  n  index  in  the  population 
(n=l,...,N  )  numbering.   We  utilize  the  k  indices  only  when  discussing 
statistics  of  the  random  sample. 

11.  Typical  numbers  for  a  study  of  U.S.  family  demand  behavior  are  N  =  70  million 
for  1972,  with  a  budget  study  of  size  K  =  10,000. 

12.  See  also  MacDonald  and  Lawrence  (1978)  for  a  recent  contribution. 

13.  In  addition,  see  Green  (1964). 

14.  This  approach  was  introduced  by  Theil  (1954);  see  also  Theil  (1975),  chapter 
5  and  Green  (1964) . 

15.  $\nabla_{\bar{V}}\, \tilde{x}$ represents the gradient of $\tilde{x}$ with respect to $\bar{V}_N$; i.e. the $M$-vector with $i$th component $\partial\tilde{x} / \partial\bar{V}_{iN}$, evaluated at $(\bar{V}_N, \mu_v, \theta_0, \gamma, N)$.


16.  Rao (1973),  section  2c  is  an  excellent  reference  for  these  theorems;  also, 
see  section  6a  for  some  useful  corollaries. 

17.  It is useful to point out that our underlying population assumptions give
b a slightly different asymptotic distribution than in the standard linear

/\ 

model.   In  particular,   vic(b  -  G(y  >6oiY))  approaches  a  normal  vector  as 

K.        V 

K-w>  with  mean  zero  and  variance  covariance  matrix 

b     w     (.xv)  (xv)   w 

where  T,  ,°    .  ,      x  is  the  matrix  with  mm'  element 
(xv)  (xv) 

E[((x*^»  -  <t,*)(v^(A^»)  -  ^    )    -  c)    ' 
m        m     xm 

((x^°  -  **)(v^.(a''"')  -  y  .)  -  a_.)] 
m         m      xm 

T.,    will  correspond  to  the  usual  expression  (i.e.  OL  ,  a   is  residual 

b  w 

^2  t      t       t    t 

variance  if  there  is  a  zero  correlation  between  a   and  (v.(A  ")  -  y  ")  (v.  (A,  *')-p.  ° 

for  i,j 1,...,M,  where  u^  =  xf"  -  xf"  -  (v(A^')  -  v5'*)''b  .   Use  of  the  standard 

estimators  may  provide  an  adequate  approximation  to  Z,  if  the  sample 

b 

counterparts  to  these  correlations  are  small. 

18.  The  definition  of  uniform  convergence  can  be  found  in  Apostol  (1967), 
p.  424  and  Buck  (1965),  p.  180-2. 

19.  This  standard  result  of  analysis  is  available  in  most  books  on  advanced 

calculus,  c.f.  Buck  (1965),  section  4.2  (Theorem  21  in  particular). 

20.  For instance, see the second example of section 3.2 where $\sigma^2$ is set to a fixed constant. In this case $\bar{A}$ is sufficient for $\mu$ and $\phi^*$ is a quadratic function of $\mu$.

21.  See  Rao  (1973),  section  2d. 3,  or  Ferguson  (1967),  section  3.3,  Theorem  1. 

22.  The exponential family form (4.1) is quite general. Examples of univariate distributions expressible in this form are the normal ($\mu$, $\sigma^2$), Poisson ($\mu$), negative binomial ($r$, $\theta$), the gamma distributions and the beta distributions. Examples of multivariate distributions expressible in this form include the normal with mean $\mu$ and variance covariance matrix $\Sigma$. Distributions which are not of the form (4.1) include the uniform and Cauchy distributions. See Ferguson (1967) for more details.

23.  This  theorem  is  mentioned  in  Ferguson  (1967),  p.  129  and  Lehmann(1959) ,  p.  51 
without  proof.  The  original  references  to  its  proof  are  Koopman  (1936) ,  Darmois 
(1935)  and  Pitman  (1936),  prompting  some  authors  to  refer  to  the  exponential 
form  as  the  Koopman-Pitman-Darmois  form.   An  excellent  paper  that  proves  the 
theorem  in  more  generality  than  is  needed  here  (with  the  original  result  as 
corollary)  is  Barankin  and  Maitra  (1963). 

24.  Actually  the  formulae  involving  the  first  and  second  order  derivatives  of 
-InC  appear  as  an  exercise  in  Lehmann  (1959),  p.  58,  problem  14. 

25.  This is because the formula (4.9) is a continuous and differentiable function of the moments comprising it. "Standard methods" refer to applications of theorems such as Lemma AP.1.

26.  Of course $\nabla^2_{\mu_v}\phi^* = 0$ is only a necessary condition for linearity, but still its failure is reason to reject linearity.

27.  Although formula (4.9) is directly estimable from cross section moments, it would be useful if (4.9) could be related to simpler statistics, such as regression coefficients. In this sense it is easily shown that if $M = 1$, performing the micro regression

$$x_k = c_0 + c_1 v_1(A_k) + c_2 v_1(A_k)^2$$

gives

$$\text{plim}\; c_2 = \frac{\sigma_{x11}\,\sigma_{11} - \sigma_{111}\,\sigma_{x1}}{\sigma_{11}\big(\sigma_{1111} - (\sigma_{11})^2\big) - (\sigma_{111})^2}$$

which is clearly proportional to (4.11), and thus provides an easily computable test of (4.11) equalling zero (although bear in mind stochastic structure differences, as in fn. 17). The natural conjecture is that including all squared and cross-product terms in a micro regression produces an expression proportional to (4.9) in matrix form. Unfortunately, proving or disproving this result is a computational nightmare, and to date the author has not completely resolved this problem.